
    Bilingual distributed word representations from document-aligned comparable data

    We propose a new model for learning bilingual word representations from non-parallel document-aligned data. Following recent advances in word representation learning, our model learns dense real-valued word vectors, that is, bilingual word embeddings (BWEs). Unlike prior work on inducing BWEs, which relied heavily on parallel sentence-aligned corpora and/or readily available translation resources such as dictionaries, the article shows that BWEs can be learned solely on the basis of document-aligned comparable data, without any additional lexical resources or syntactic information. We compare our approach with previous state-of-the-art models for learning bilingual word representations from comparable data that rely on the framework of multilingual probabilistic topic modeling (MuPTM), as well as with distributional local context-counting models. We demonstrate the utility of the induced BWEs in two semantic tasks: (1) bilingual lexicon extraction, and (2) suggesting word translations in context for polysemous words. Our simple yet effective BWE-based models significantly outperform the MuPTM-based and context-counting representation models from comparable data as well as prior BWE-based models, and achieve the best reported results on both tasks for all three tested language pairs. This work was done while Ivan Vulić was a postdoctoral researcher at the Department of Computer Science, KU Leuven, supported by the PDM Kort fellowship (PDMK/14/117). The work was also supported by the SCATE project (IWT-SBO 130041) and the ERC Consolidator Grant LEXICAL: Lexical Acquisition Across Languages (648909).
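    One way to induce BWEs from document-aligned pairs alone, in the spirit of the model described above, is to merge each aligned document pair into a single pseudo-bilingual document and train a standard skip-gram model over the merged corpus. The sketch below illustrates that idea; the language suffixes, shuffling scheme, and use of gensim are illustrative assumptions, not necessarily the paper's exact procedure.

        import random
        from gensim.models import Word2Vec

        def merge_documents(doc_l1, doc_l2, seed=0):
            """Interleave the tokens of two aligned documents into one
            pseudo-bilingual document (illustrative merging scheme)."""
            merged = [w + "_l1" for w in doc_l1] + [w + "_l2" for w in doc_l2]
            random.Random(seed).shuffle(merged)
            return merged

        def train_bwes(aligned_pairs, dim=300):
            # aligned_pairs: list of (tokens_l1, tokens_l2) document pairs
            corpus = [merge_documents(d1, d2, seed=i)
                      for i, (d1, d2) in enumerate(aligned_pairs)]
            # Skip-gram over the merged corpus places words of both
            # languages in one shared embedding space.
            model = Word2Vec(corpus, vector_size=dim, sg=1,
                             window=16, min_count=1)
            return model.wv

    Cross-lingual nearest neighbours in the shared space (e.g. wv.most_similar("house_l1")) then serve as candidate entries for bilingual lexicon extraction.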

    Heparin-containing block copolymers, Part I: Surface characterization

    Newly synthesized heparin-containing block copolymers, consisting of a hydrophobic block of polystyrene (PS), a hydrophilic spacer block of poly(ethylene oxide) (PEO) and covalently bound heparin (Hep) as the bioactive block, were coated on aluminium, glass, polydimethylsiloxane (PDMS), PS or Biomer substrates. The surfaces of the coated materials were characterized by transmission electron microscopy (TEM), contact angle measurements and X-ray photoelectron spectroscopy for chemical analysis (XPS). TEM demonstrated that thin films of PS-PEO and PS-PEO-Hep block copolymers consisted of heterogeneous, microphase-separated structures. Sessile-drop and dynamic Wilhelmy-plate contact angle measurements provided insight into the hydrophilicity of the coating surfaces. Measurements on hydrated coatings of PS-PEO and PS-PEO-Hep block copolymers revealed that the surfaces became more hydrophilic during immersion in water, due to relaxation/reorientation or swelling of the PEO or PEO-Hep domains, respectively. XPS results for PS, PEO, heparin and PS-PEO as powders agreed well with qualitative and quantitative predictions. XPS results for films of PS-PEO and PS-PEO-Hep block copolymers showed enrichment of PEO in the top layers of the coatings; this effect was more pronounced for hydrated surfaces. Only small amounts of heparin were detected at the surface of coatings of PS-PEO-Hep block copolymers.
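    For reference, the Wilhelmy plate technique mentioned above recovers the dynamic contact angle from the buoyancy-corrected force F measured on a plate of wetted perimeter p in a liquid of surface tension gamma; this is the standard relation, not a formula taken from the paper:

        F = \gamma \, p \cos\theta
        \quad\Longrightarrow\quad
        \cos\theta = \frac{F}{\gamma p}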

    Fully statistical neural belief tracking

    This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST). The existing NBT model uses a hand-crafted belief state update mechanism which involves an expensive manual retuning step whenever the model is deployed to a new dialogue domain. We show that this update mechanism can be learned jointly with the semantic decoding and context modelling parts of the NBT model, eliminating the last rule-based module from this DST framework. We propose two different statistical update mechanisms and show that dialogue dynamics can be modelled with a very small number of additional model parameters. In our DST evaluation over three languages, we show that this model achieves competitive performance and provides a robust framework for building resource-light DST models.
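    The proposed change can be pictured as replacing the hand-crafted rule with a trainable interpolation between the previous belief state and the current turn's prediction, which indeed adds only a handful of parameters. A minimal PyTorch sketch of one such mechanism follows; the exact parametrisation is an illustrative assumption, not the paper's model.

        import torch
        import torch.nn as nn

        class LearnedBeliefUpdate(nn.Module):
            """Learned belief-state update: mix the turn-level prediction
            with the previous belief using trainable coefficients."""
            def __init__(self, num_values):
                super().__init__()
                # one mixing weight per slot value: very few extra parameters
                self.mix_logits = nn.Parameter(torch.zeros(num_values))

            def forward(self, prev_belief, turn_prediction):
                lam = torch.sigmoid(self.mix_logits)
                updated = lam * turn_prediction + (1.0 - lam) * prev_belief
                # renormalise so the belief stays a distribution over values
                return updated / updated.sum(dim=-1, keepdim=True)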

    Discriminating between lexico-semantic relations with the specialization tensor model

    We present a simple and effective feed-forward neural architecture for discriminating between lexico-semantic relations (synonymy, antonymy, hypernymy, and meronymy). Our Specialization Tensor Model (STM) simultaneously produces multiple different specializations of input distributional word vectors, tailored for predicting lexico-semantic relations for word pairs. STM outperforms more complex state-of-the-art architectures on two benchmark datasets and exhibits stable performance across languages. We also show that, if coupled with a cross-lingual distributional space, the proposed model can transfer the prediction of lexico-semantic relations to a resource-lean target language without any training data.
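    A simplified reading of the architecture in code: K trainable specialization matrices map both input vectors into K specialized views, one similarity feature is computed per view, and a linear layer classifies the relation. The PyTorch sketch below is an illustration; the tensor shapes and the choice of cosine features are assumptions rather than the exact published model.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SpecializationTensorModel(nn.Module):
            def __init__(self, dim, num_specs, num_relations):
                super().__init__()
                # a tensor of K specialization matrices (dim x dim each)
                self.spec = nn.Parameter(torch.randn(num_specs, dim, dim) * 0.01)
                self.classifier = nn.Linear(num_specs, num_relations)

            def forward(self, x1, x2):
                # specialize both word vectors: (K, batch, dim)
                s1 = torch.einsum('kde,bd->kbe', self.spec, x1)
                s2 = torch.einsum('kde,bd->kbe', self.spec, x2)
                # one cosine similarity per specialization -> (batch, K)
                feats = F.cosine_similarity(s1, s2, dim=-1).transpose(0, 1)
                return self.classifier(feats)  # logits over relation labels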

    Explicit retrofitting of distributional word vectors

    Semantic specialization of distributional word vectors, referred to as retrofitting, is a process of fine-tuning word vectors using external lexical knowledge in order to better embed some semantic relation. Existing retrofitting models integrate linguistic constraints directly into learning objectives and, consequently, specialize only the vectors of words from the constraints. In this work, in contrast, we transform external lexico-semantic relations into training examples which we use to learn an explicit retrofitting model (ER). The ER model allows us to learn a global specialization function and to specialize the vectors of words unobserved in the training data as well. We report large gains over original distributional vector spaces in (1) intrinsic word similarity evaluation and on (2) two downstream tasks, lexical simplification and dialog state tracking. Finally, we also successfully specialize vector spaces of new languages (i.e., languages unseen in the training data) by coupling ER with shared multilingual distributional vector spaces.
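    The contrast with constraint-local retrofitting can be made concrete: rather than moving only the vectors named in the constraints, ER fits a global function f that can afterwards specialize any vector, seen or unseen. A hedged PyTorch sketch, with the network shape and loss chosen purely for illustration:

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ExplicitRetrofitter(nn.Module):
            """Global specialization function f, trained on word pairs
            derived from external lexico-semantic constraints."""
            def __init__(self, dim, hidden=512):
                super().__init__()
                self.f = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                       nn.Linear(hidden, dim))

            def forward(self, x):
                return self.f(x)

        def retrofit_loss(model, x1, x2, target_sim, alpha=0.3):
            # pull synonym pairs together / push antonym pairs apart,
            # while a regularizer keeps vectors near the original space
            y1, y2 = model(x1), model(x2)
            fit = F.mse_loss(F.cosine_similarity(y1, y2), target_sim)
            reg = F.mse_loss(y1, x1) + F.mse_loss(y2, x2)
            return fit + alpha * reg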

    Bridging languages through images with deep partial canonical correlation analysis

    We present a deep neural network that leverages images to improve bilingual text embeddings. Relying on bilingual image tags and descriptions, our approach conditions text embedding induction on the shared visual information for both languages, producing highly correlated bilingual embeddings. In particular, we propose a novel model based on Partial Canonical Correlation Analysis (PCCA). While the original PCCA finds linear projections of two views in order to maximize their canonical correlation conditioned on a shared third variable, we introduce a non-linear Deep PCCA (DPCCA) model, and develop a new stochastic iterative algorithm for its optimization. We evaluate PCCA and DPCCA on multilingual word similarity and cross-lingual image description retrieval. Our models outperform a large variety of previous methods, despite not having access to any visual signal during test time inference. Our code and data are available at: https://github.com/rotmanguy/DPCCA
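    For orientation, linear PCCA as characterised above can be written in two steps: residualize both views on the shared variable Z (here, the visual representation), then maximize the canonical correlation of the residuals; DPCCA replaces the linear maps with deep networks. In standard textbook notation (not copied from the paper):

        \tilde{X} = X - \Sigma_{XZ}\Sigma_{ZZ}^{-1} Z, \qquad
        \tilde{Y} = Y - \Sigma_{YZ}\Sigma_{ZZ}^{-1} Z,
        \qquad
        (u^{*}, v^{*}) = \arg\max_{u,v}\ \operatorname{corr}\big(u^{\top}\tilde{X},\, v^{\top}\tilde{Y}\big)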

    Scoring lexical entailment with a supervised directional similarity network


    A systematic study of leveraging subword information for learning word representations

    The use of subword-level information (e.g., characters, character n-grams, morphemes) has become ubiquitous in modern word representation learning. Its importance is especially pronounced for morphologically rich languages, which generate large numbers of rare words. Despite a steadily increasing interest in such subword-informed word representations, a systematic comparative analysis across typologically diverse languages and different tasks is still missing. In this work, we deliver such a study, focusing on the variation of two crucial components required for subword-level integration into word representation models: 1) segmentation of words into subword units, and 2) subword composition functions to obtain final word representations. We propose a general framework for learning subword-informed word representations that allows for easy experimentation with different segmentation and composition components, including more advanced techniques based on position embeddings and self-attention. Using the unified framework, we run experiments over a large number of subword-informed word representation configurations (60 in total) on 3 tasks (general and rare word similarity, dependency parsing, fine-grained entity typing) for 5 languages representing 3 language types. Our main results clearly indicate that there is no "one-size-fits-all" configuration, as performance is both language- and task-dependent. We also show that configurations based on unsupervised segmentation (e.g., BPE, Morfessor) are sometimes comparable to or even outperform the ones based on supervised word segmentation.
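    The two varied components can be seen directly in code: a segmentation step that splits a word into subword units (e.g. BPE or Morfessor output), and a composition step that builds the word vector from unit embeddings. Below is a minimal PyTorch sketch of an additive, position-aware composer; the shapes and the additive choice are illustrative, and the paper's framework also covers self-attention composition.

        import torch
        import torch.nn as nn

        class SubwordComposer(nn.Module):
            def __init__(self, subword_vocab, dim, max_units=8):
                super().__init__()
                self.subword_emb = nn.Embedding(subword_vocab, dim)
                self.pos_emb = nn.Embedding(max_units, dim)  # position embeddings

            def forward(self, subword_ids):
                # subword_ids: (batch, max_units) indices from a pre-trained
                # segmenter; padding handling is omitted for brevity
                pos = torch.arange(subword_ids.size(1), device=subword_ids.device)
                units = self.subword_emb(subword_ids) + self.pos_emb(pos)
                return units.sum(dim=1)  # additive composition -> word vector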